I.  INTRODUCTION


As a young and developing information system organization,
the Indonesian Army Data Collecting and Processing Service
(DISPULLAHTAD) has seen a tremendous proliferation of
application files. It is no surprise that there is much
redundancy of data and effort. Data redundancy wastes
limited resources and, furthermore, raises the problem of
inconsistent data, that is, the same data element having
different values in different files. The implementation of
a database management system (DBMS) could address this
problem by providing more control over, and more effective
management of, data.

On one hand, electronically generated information is more
and more in demand, to the point where it has become a
critical issue for the Indonesian Army. On the other hand,
the personnel generating and maintaining this information
move frequently because of the requirements of military tour
of duty and tour of area. This situation creates problems in
keeping information accurate and up to date. Even though
manual documentation is always done properly, it is not
always adequate. It is often the case that applications are
highly dependent upon the personnel responsible for them.
Standardized and centralized documentation is a "must",
especially when a DBMS is implemented. In this regard, the
data dictionary is a powerful vehicle that supports such
documentation.

According to Dolk [Ref. 1], "a data dictionary is a
collection of an enterprise's meta-data designed as one or
more databases which can be retrieved and analyzed using
standard database management system capabilities". This
will be discussed further in a subsequent chapter, and will
be considered as a basis in choosing the most appropriate
DBMS to be implemented by DISPULLAHTAD.

The organizational structure of DISPULLAHTAD, its system
configuration, and its various current applications will be
described briefly in order to provide a background for the
succeeding chapters. A discussion of database management
systems and a recommendation of the most appropriate DBMS to
be implemented comprise the last chapter.


II.  INDONESIAN ARMY DATA COLLECTING AND PROCESSING SERVICE

(DISPULLAHTAD)

A.   ORGANIZATION, TASK, AND SYSTEM CONFIGURATION

DISPULLAHTAD is an acronym in the Indonesian language
that stands for the Indonesian Army Data Collecting and
Processing Service. It was initiated in Fiscal Year
1973/1974 and formally organized in Fiscal Year 1975/1976
[Ref. 2]. DISPULLAHTAD is located at the Indonesian Army
Headquarters in Jakarta, the capital city of the Republic of
Indonesia.

DISPULLAHTAD's main task is to provide electronically
processed information to all organizational elements of the
Army that require it [Ref. 2]. In order to accomplish this
task, DISPULLAHTAD is equipped with several computer
configurations. As of 1984, these include an IBM System
4341, an IBM System 370, several IBM System 3740s, and
several TRS-80s.

At the lower organizational levels, such as the Military
Area Commands (KODAM), Army Finance Service (JANKUAD), Army
Administrative and Personnel Service (JANMINPERSAD), Army
Development and Educational Command (KOBANGDIKLAT), etc.,
each organization has its own Data Processing Service, called
PULLAHTA KOTAMA/LAKPCS, or PULLAHTA for short. A simplified
organizational structure of the Indonesian Army is presented
as Figure 2.1, in which the positions of both DISPULLAHTAD
and the PULLAHTAs can be clearly seen.

Figure 2.1  Indonesian Army Organizational Structure
(Simplified Chart).

Each PULLAHTA has two roles: first, to process and
provide all pertinent information requested by the
organization to which it is attached, and second, to provide
data entry for all applications centrally processed by
DISPULLAHTAD. In order to accomplish these objectives, each
PULLAHTA is also equipped with some hardware, as described
below.

1.   PULLAHTAs inside Java Island

PULLAHTAs that belong to the Military Area Commands on
Java island are equipped with an IBM System 4331 connected
to the IBM System 4341 at DISPULLAHTAD in "online" mode via
dedicated public telephone lines. This is the first stage in
networking all PULLAHTAs throughout the country. Data
interchange is done electronically through these dedicated
lines.

2.   PULLAHTAs outside Java Island

PULLAHTAs sited outside Java island are equipped
with an IBM System 3740 and currently work in "off line"
mode. Eventually they will be connected to DISPULLAHTAD via
dedicated public telephone lines. For now, data interchange
is done manually using floppy disks transported by airline,
and it takes one to two days for the data to reach its
destination.

B.   APPLICATIONS 

All applications run by DISPULLAHTAD serve its task of
electronically providing the information needed by the
Indonesian Army. These applications are separated into three
kinds of Management Information Systems [Ref. 3]:

1.   Administration Management Information System

This  category  includes  applications  in  finance, 
logistics,  personnel,  and  all  applications  pertinent  to 
development,  education,  and  corps/specialty. 

2.   Military Management Information System

The applications of intelligence and security,
territorial, communication and electronics, and organization,
operation, and training are included in this category.

3.   Planning and Controlling Management Information System

Three kinds of applications are included in this
category: planning and budgeting, auditing and control, and
command, control, and communication.

C.  DATA REDUNDANCY PROBLEMS

There are many possible data redundancies within those
applications. For instance, consider the data about name,
rank, SSN, corps, occupation, etc., that belong to an
individual assigned as Intelligence Officer in a Territorial
Unit. His data will appear in at least four different files:
the personnel file, payroll file, intelligence file, and
territorial file. If there is a change to just one of those
data elements, redundant effort is required to update all
four files, and data inconsistency may result if any file is
missed.
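
The update problem can be illustrated with a minimal Python sketch. The file layouts, field names, and sample values below are hypothetical and are not taken from the DISPULLAHTAD applications; the sketch only shows how a change applied to one of four redundant copies leaves the files inconsistent.

```python
# Hypothetical illustration: one officer's data duplicated in four
# application files. Names and values are invented for the example.
record = {"ssn": "000001", "name": "EXAMPLE OFFICER", "rank": "CAPTAIN"}

files = {
    "personnel": dict(record),
    "payroll": dict(record),
    "intelligence": dict(record),
    "territorial": dict(record),
}


def inconsistent_elements(files):
    """Return the names of data elements whose values differ between files."""
    names = set()
    for rec in files.values():
        names.update(rec)
    return sorted(
        name for name in names
        if len({rec.get(name) for rec in files.values()}) > 1
    )


# A promotion is recorded in the personnel file only.
files["personnel"]["rank"] = "MAJOR"
print(inconsistent_elements(files))  # ['rank']
```

Only after the same change has been repeated in the other three files does the check return an empty list, which is the redundant effort described above.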

It happens many times that top-level management detects
data inconsistency in two different reports generated by
DISPULLAHTAD (e.g., the total number of personnel differs
between the personnel report and the payroll report). This
has become very annoying and greatly reduces the credibility
of the computer system. An effort should therefore be made
to eliminate the problem, and one way of doing that is by
designing and implementing a data dictionary system (DDS) in
concert with a DBMS.

D.  STAGED  DEVELOPMENT  APPROACH 

The design of any information system is the most
difficult and critical step. It should be done with great
care and full awareness. As suggested by Sprague and Carlson
[Ref. 4], there are three different approaches to initiating
an information system:

•  Quick-Hit approach.

This approach should be taken if it is not yet clear
whether such an information system is needed, but there is a
recognized high payoff for initiating the system. This
approach requires developing the system in the most
beneficial area, capturing the benefits, and then
considering what to do next.

•  Staged Development approach.

This approach develops the system in the most beneficial
area, as in the quick-hit approach, but with some advance
and clear planning. Therefore, part of the effort in
developing the first system can be reused in developing the
second. This approach is very appropriate for initiating an
information system clearly supported by top management, but
with limited available resources.

•  Complete System approach.

This  approach  requires  the  longest  development  time  and 
highest  development  costs  before  any  benefits  are  attained. 
Before  building  any  part  of  the  system,  a  full-service 
system  generator  and  the  organizational  structure  for 
managing  it  must  be  developed  first.  In  this  regard,  this 
approach  represents  the  most  risky  option. 

For DISPULLAHTAD, it is anticipated that top management
will strongly support the implementation of a DBMS;
therefore the Quick-Hit approach is not necessary. On the
other hand, limited computer resources make a Complete
System approach infeasible. Hence, the most appropriate
approach is the Staged Development approach. In the Staged
Development approach, it is crucial to identify the
functional area where there is reason to expect the highest
payoff in starting the project. Using the highest volume of
transactions as the criterion, and evaluating the
transaction data gathered between April 1983 and December
1983 (see Figure 2.2), there are three applications with
high transaction volume: the personnel, payroll, and finance
report applications. Only the personnel and payroll
applications have a master file that is maintained and used
continuously. Besides that, these two applications are the
most crucial in maintaining personnel morale and the most
often used in relation to the personnel management task.

Based on these evaluations, the data used by the personnel
and payroll applications will be the first database to be
implemented by DISPULLAHTAD. These applications include
personnel, payroll, intelligence personnel, and territorial
personnel. Accordingly, the discussion of the DDS and DBMS
will be limited to those applications.

The plan for this staged development approach is:

1.  Initial design of the DD covering the personnel, payroll,
intelligence personnel, and territorial personnel
applications.

2.  Implementation of the personnel and payroll database.

3.  Design of the DD for all data used by the remaining
applications excluded from the first stage.

4.  Implementation of other databases.


III.  DATA DICTIONARY

A.   GENERAL

The revolutionary change in computer technology has
created another challenge: how to organize and manage the
very large-scale databases made possible by the combination
of database management systems (DBMS) and powerful new
hardware systems. The need to control the enterprise's data
becomes critical due to the proliferation of microcomputers,
which triggers more and more applications, which in turn
create redundancy and data inconsistency problems. At the
same time, the number of microcomputer users demanding
direct access to the enterprise data is also increasing.
This direct access to large and complex databases again
creates the problem of how to "coordinate" and control these
complex information structures.

Data redundancy, data inconsistency, and the need to
control the enterprise's data lead to the design and
implementation of database systems. The database environment
itself assumes an architectural plan designed to minimize
redundancy and to emphasize accessibility. It assumes
logical and physical structures aimed at separate
objectives. It also assumes that an individual file may
serve many different applications. All of this is far too
much complexity to be managed without precise and up-to-date
documentation and control. The data dictionary is designed
to define all appropriate aspects of the enterprise's data,
so that it can be used as a tool to control and manage the
database system no matter how great its size and how complex
its structures.

B.   INFORMATION RESOURCE MANAGEMENT (IRM)

The concept of IRM is that information is a vital
enterprise asset that should be invested in and used like
other resources [Ref. 5].

IRM is the task of managing information resources such
as data, processes, users, software, and hardware in an
integrated and coordinated manner. IRM includes all
management aspects of the information-related operations of
an organization, such as policy formulation, resource
allocation, implementation, and control.

A definition of IRM was formulated at a workshop on Data
Dictionary Systems and Information Resource Management
sponsored by the Association for Computing Machinery and the
National Bureau of Standards in 1980:


Information Resource Management is whatever policy,
action, or procedure concerning information (both
automated and non-automated) which management establishes
that serve the overall current and future needs of the
enterprise. Such policies, etc., would include
considerations of availability, timeliness, accuracy,
integrity, privacy, security, auditability, ownership, use,
and cost-effectiveness.


This definition of IRM was chosen to emphasize the
enterprise-wide nature of planning and execution of
information policies, actions, and procedures in order that
data can be treated as a true resource. It also reflects the
primary shift in data processing from processing-centered
design methodologies to data-centered methodologies.

1.   Data Dictionary as the Tool of IRM

One of the problems encountered in IRM is the vast
amount of data about information resources required to
manage the enterprise data, together with the very complex
and numerous relationships existing between them. This is
precisely the sort of task that a Data Dictionary System can
be made to do, provided that it has been set up to deal not
just with data entities or process entities, but with the
entire range of information resource entities.

2.  Data Dictionary

A data dictionary is a collection of meta-data (data
about the enterprise data) that could consist of: the name
of the data (including its synonyms and/or homonyms), the
location of the data, a description of the meaning of the
data, the relations between data, how the data is used, who
is responsible for the data, the source of the data, etc.;
in short, a store of all the appropriate information about
the data.
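
As a rough illustration of the kinds of meta-data listed above, a single dictionary entry might be modeled as follows. This is only a sketch: the class, its field names, and the sample values are assumptions made for the example and do not describe any particular DDS product.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryEntry:
    """Meta-data about one data element (illustrative field names)."""
    name: str
    description: str       # meaning of the data
    location: str          # file or database where the data is stored
    source: str            # where the data originates
    responsible: str       # who is responsible for the data
    synonyms: list = field(default_factory=list)
    used_by: list = field(default_factory=list)  # applications using the data


# Hypothetical entry for a data element shared by two applications.
rank = DictionaryEntry(
    name="RANK",
    description="Military rank of an individual",
    location="personnel master file",
    source="personnel administration",
    responsible="personnel application owner",
    synonyms=["GRADE"],
    used_by=["personnel", "payroll"],
)


def where_used(entries, application):
    """List the names of the entries used by the given application."""
    return [e.name for e in entries if application in e.used_by]


print(where_used([rank], "payroll"))  # ['RANK']
```

Keeping the usage information alongside the definition is what allows a question such as "which applications are affected if this element changes?" to be answered from the dictionary alone.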

Recently, there has been a trend towards extending the
data dictionary to include the following functions:

a.  Definition  of  other  data  constructs  such  as 
records  and  files. 

b.  Definition of processes such as programs or
manual processes.

c.  Definition  of  data  users  whether  individuals  or 
organizational  entities. 

Along  with  these  definitions,  the  data  dictionary 
also  began  to  be  used  to  document  the  cross-references 
between  them  and  to  record  their  usage  and  organizational 
responsibilities. 

3.  Data Dictionary System (DDS)

A Data Dictionary System is a combination of software
and procedures that aid an enterprise in setting up and
maintaining its complex structure of data resources. The
software itself may be produced in-house or acquired from
software vendors. For the following three reasons, it is
often better to purchase the software than to build it
in-house:

a.   Design and Implementation

The task of designing and implementing a DDS,
even one of modest functionality, is definitely a
non-trivial one. There is a good chance that the magnitude
of the task will be underestimated and that greater
resources will be needed than those originally estimated.

b.   Gaining a Success

If the use of a DDS is to be at all successful,
the software itself must conform to high standards of
quality assurance. The use of the same software at many
other installations, as is the case with a commercial
package, aids in the early discovery of possible software
errors and their correction.

c.   Technology  Progress 

There  are  good  reasons  for  assuming  that  DDS 
technology  will  continue  to  progress  and  that  substantial 
enhancements  will  significantly  increase  the  usefulness  and 
value  of  the  DDS,  and  it  will  be  difficult  for  an  in-house 
system  to  keep  pace. 

There are several DDSs commercially available
now, such as DB/DC Data Dictionary by IBM, DATAMANAGER by
MSP Inc., Integrated Data Dictionary (IDD) by Cullinet
Software Inc., DATADICTIONARY by Applied Data Research Inc.,
Extended Data Dictionary (XDD) by Intell Systems Corp., UCC
TEN by University Computing Company, and Data Control System
(DCS) by Cincom Systems Inc.

In the following section, features that one
could expect to find in the available DDSs will be
discussed. It must be pointed out that none of the DDSs
mentioned above will necessarily have all of the features
explained, and there is no implication that all of these
features are required in all DDS applications.

C.   FEATURES OF A DATA DICTIONARY

Several issues will be discussed here: Implementation
and Architectural Issues, Data Dictionary (DD) and DD
Schema, Extensibility Facilities, Status Facilities,
Dictionary Commands, Bridge Facilities, and Data Dictionary
System Security.

1.   Implementation and Architectural Issues

a.   The Relationship Between DDS and DBMS

The primary purpose of a DBMS is to manage data,
whereas the primary purpose of a DDS is to manage meta-data.
It is therefore clear that there is very little overlap
between the two; in fact they are complementary: both
functions are required for proper management of information
resources.

Some of the functions a DDS performs are in
support of one or more DBMSs. This is to be expected, as the
DDS manages all meta-data, including meta-data for which the
actual instances of data are stored in a database, which in
turn is managed by a DBMS. To manage the data in its
databases, the DBMS needs to know certain meta-data about
those databases in order to do its processing. This
meta-data is commonly referred to as the DBMS-Directory, and
it should be clear that this is potentially one area of
overlap between the DDS and the DBMS. In this sense, it is
preferable to design a DDS prior to the DBMS implementation
rather than to build a DDS that has to be fitted to an
existing DBMS. This reason, together with the existing
practice of implementing a DDS as one of the DBMS's
applications, explains why in this thesis the dictionary is
designed prior to the design of the databases.

b.   The Method of DDS Implementation

It is preferable to design a data dictionary
prior to the implementation of the DBMS. On the other hand,
the implementation of that designed data dictionary as a
complete DDS is a good candidate to be one of the DBMS
applications, and indeed a number of existing DDSs are
implemented in this manner. But this is not the only option;
a DDS can be implemented as either:

•  DBMS-dependent system.

This is a DDS that uses a DBMS in its implementation.

•  Free-standing system.

This is a DDS that does not use a DBMS in its
implementation.

There is no ultimate answer as to whether a
free-standing or a DBMS-dependent DDS is best. There are
pros and cons on both sides, and these depend on the
enterprise's particular circumstances, such as whether the
enterprise implements a DBMS or not, and whether it uses
multiple DBMSs or a single DBMS for its databases.
Enterprises that intend to implement a DDS but not a DBMS
will favor a free-standing system. On the other hand,
enterprises having multiple DBMSs may implement a
DBMS-dependent system and choose one of their DBMSs to
implement it, or they may implement a free-standing system
in order to provide more flexible and even-handed control.

Other considerations include DDS security and
the view that the scope of DDS usage is substantially
broader than the DBMS environment. With a DBMS-dependent
system, personnel familiar with the use of the DBMS may find
it easier to break the DDS security than would be the case
with a free-standing system.

For the sake of completeness, there is another
DDS implementation method, referred to as the integrated
DDS. This method eliminates the overlap between the data
dictionary and the DBMS-Directory by combining these
features into one. The advantage gained by combining these
two features is that redundant storage of the meta-data is
eliminated.

c.   Active and Passive DDS

Processes that require meta-data for their
execution need a command or series of commands, representing
some DDS functionality, that produces the required
meta-data. This functionality is called the dictionary
interface, and there are two kinds of such interface: active
and passive.

An active interface means that all processes
that require meta-data will use the most current meta-data
in the data dictionary. Similarly, all processes which in
the course of their execution generate meta-data are
required to store the generated meta-data in the data
dictionary.

With a passive interface, on the other hand,
everything that an active interface requires becomes
optional. A process that requires meta-data for its
execution may retrieve it either from the data dictionary or
from some other location; or, if the process already
contains the meta-data, the system has the option of
checking whether or not this meta-data is the most current
version in the data dictionary. Likewise, a process that
generates meta-data has the option to store or not to store
the generated meta-data.

Therefore, two conclusions can be drawn about
the dictionary interface:

(1) A DDS may have some interfaces which are
active and others which are passive.

(2) The fact that an interface is active is a
property not only of the DDS, but also of the overall system
of which the DDS is a part.
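
The difference between the two kinds of interface can be sketched in a few lines of Python. The class and function names below are hypothetical and greatly simplified; they are meant only to show that an active read always goes through the dictionary, while a passive read may use a process's own copy of the meta-data and checks its currency only if asked to.

```python
class DataDictionary:
    """Toy dictionary holding versioned meta-data (an assumption for this sketch)."""

    def __init__(self):
        self._meta = {}  # name -> (version, definition)

    def store(self, name, definition):
        """Store a definition and return its new version number."""
        version = self._meta.get(name, (0, None))[0] + 1
        self._meta[name] = (version, definition)
        return version

    def fetch(self, name):
        """Return the current (version, definition) pair."""
        return self._meta[name]


def active_read(dd, name):
    """Active interface: the current meta-data is always taken from the dictionary."""
    return dd.fetch(name)


def passive_read(dd, name, local_copy, check=False):
    """Passive interface: the process may use its own copy; checking is optional."""
    if check and local_copy != dd.fetch(name):
        raise ValueError(f"local meta-data for {name} is out of date")
    return local_copy


dd = DataDictionary()
dd.store("RANK", "CHAR(10)")
local = dd.fetch("RANK")        # a program keeps its own copy
dd.store("RANK", "CHAR(12)")    # the definition changes later

print(active_read(dd, "RANK"))          # (2, 'CHAR(12)')
print(passive_read(dd, "RANK", local))  # (1, 'CHAR(10)')
```

Calling `passive_read(dd, "RANK", local, check=True)` in this state raises an error, which corresponds to the optional currency check described for the passive interface.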

2.   Data Dictionary and Data Dictionary Schema

Data Dictionary denotes the organized and structured
collection of meta-data which comprises the contents of the
DDS. The data dictionary schema denotes the logical
structure of the data dictionary, in a manner analogous to
the use of the same term in the context of a DBMS.

The structural characteristics of the data dictionary
and the contents of the data dictionary schema determine
what kinds of meta-data can be stored in the data dictionary
and what kinds of relationships can be established between
them. Some systems have extensibility facilities whereby an
installation can customize the data dictionary schema.

The schema is described in logical terms in order to
gain a clearer insight into what kinds of meta-data are
supported by the DDS. This logical description will, of
course, be quite different from the manner in which these
structures are actually implemented in a specific system.
The description should be independent of any
implementation.

A data dictionary has a conceptual similarity to the
Entity-Relationship-Attribute model. The basic unit in the
data dictionary is a Dictionary Entity, or Entity for short.
Entities represent real-world objects or things about which
certain information exists in the data dictionary. The
information about the entities themselves exists in the form
of Attributes, which generally denote the qualities or
quantities of properties of the entities. Finally, the data
dictionary also contains information about Relationships
between entities, and relationships may have attributes
assigned to them.

The term Entity-type is applied to entities that have
similarities among them. For example, if a set of files is
described in the data dictionary, each file will be
represented by a distinct entity. It then becomes useful to
establish an Entity-type called File in the data dictionary
and to say that all such entities representing files have
the entity-type File. Attributes of entities of the same
type will exhibit a certain degree of similarity. The
entity-type File will likely have an attribute for the
access method used, and maybe another attribute showing the
blocking factor used. Access method and blocking factor are
then each called an Attribute-type associated with the
entity-type File. Besides the entity-type File, there will
be an entity-type Record. Information would exist in the
data dictionary explaining which types of records are
included in a given file. All such relationships between
file entities and their associated record entities are then
said to belong to a Relationship-type. In conclusion, the
data dictionary schema can be viewed as containing all
existing entity-types, relationship-types, and
attribute-types. Any one of these three types may also be
referred to as a schema descriptor.
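
The File and Record example can be written down as a small schema to make the three kinds of schema descriptor concrete. The following Python sketch is illustrative only: the class names, the `contains` relationship name, and the sample entities are assumptions, and a real DDS would store its schema quite differently.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """The three kinds of schema descriptor held by a dictionary schema."""
    entity_types: set = field(default_factory=set)
    # entity-type name -> set of attribute-type names
    attribute_types: dict = field(default_factory=dict)
    # (relationship name, source entity-type, target entity-type)
    relationship_types: set = field(default_factory=set)


@dataclass
class Entity:
    name: str
    entity_type: str
    attributes: dict = field(default_factory=dict)


def validate(schema, entity):
    """An entity is valid if its type and all of its attribute-types are known."""
    if entity.entity_type not in schema.entity_types:
        return False
    allowed = schema.attribute_types.get(entity.entity_type, set())
    return set(entity.attributes) <= allowed


def can_relate(schema, relationship, source, target):
    """Check that the schema defines this relationship-type for the two entities."""
    key = (relationship, source.entity_type, target.entity_type)
    return key in schema.relationship_types


schema = Schema()
schema.entity_types |= {"File", "Record"}
schema.attribute_types["File"] = {"access_method", "blocking_factor"}
schema.relationship_types.add(("contains", "File", "Record"))

master = Entity("PERSONNEL-MASTER", "File",
                {"access_method": "indexed sequential", "blocking_factor": 10})
rec = Entity("PERSONNEL-RECORD", "Record")

print(validate(schema, master))                     # True
print(can_relate(schema, "contains", master, rec))  # True
print(can_relate(schema, "contains", rec, master))  # False
```

An extensibility facility, in these terms, would amount to letting an installation add its own entries to the three collections held by the schema.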

Every entity in the data dictionary has a primary
name which, depending on the particular DDS, will be unique
either in the dictionary or within the entity-type to which
the entity belongs. Some systems accommodate duplicate
user-supplied names by assigning them distinct sequential
numbers. In this case, the concatenation of user-supplied
name and sequential number constitutes the unique dictionary
name. The allowable length of the primary name should be
large enough for the name to convey the meaning of the
entity. It is common that at least some entities will also
be known by other names. Such alternate names are called
aliases or synonyms, and the most important thing is the
capability of the DDS to track them and to allow access to
the data dictionary via these alternate names. Sometimes it
is convenient if non-unique synonyms are allowed. To fulfill
this requirement, DDSs have facilities for tracking synonyms
either as attributes of the respective entities, or as
separate entities related to the primary entity. It is
therefore important that the system be able to recognize the
context in which a synonym is used.
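
The naming rules above can be sketched as a small name index. The Python below is a hypothetical illustration, not the mechanism of any specific DDS: it assumes uniqueness within an entity-type, forms the dictionary name by appending a sequence number to the user-supplied name, and allows one alias to point at several primary names.

```python
class NameIndex:
    """Toy index of primary names and aliases (assumed design for illustration)."""

    def __init__(self):
        self._issued = {}   # (entity_type, user_name) -> names issued so far
        self._aliases = {}  # alias -> set of primary names (need not be unique)

    def register(self, entity_type, user_name):
        """Return a dictionary name that is unique within the entity-type."""
        key = (entity_type, user_name)
        self._issued[key] = self._issued.get(key, 0) + 1
        return f"{user_name}-{self._issued[key]}"

    def add_alias(self, alias, primary_name):
        """Record that an alias refers to the given primary name."""
        self._aliases.setdefault(alias, set()).add(primary_name)

    def resolve(self, name):
        """Return every primary name reachable through this name."""
        return sorted(self._aliases.get(name, {name}))


index = NameIndex()
first = index.register("Item", "RANK")    # 'RANK-1'
second = index.register("Item", "RANK")   # 'RANK-2'
index.add_alias("GRADE", first)
index.add_alias("GRADE", "PAY-GRADE-1")   # a non-unique synonym

print(first, second)           # RANK-1 RANK-2
print(index.resolve("GRADE"))  # ['PAY-GRADE-1', 'RANK-1']
```

Because `GRADE` resolves to more than one entity here, the caller still has to supply the context, which is the point made above about non-unique synonyms.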

The attributes can be differentiated into several
attribute-types, among which are:

•  Description. An English-language statement describing
the meaning of the entity.

•  Classification keywords. Keywords attached to the
entities, which can then be used for selective retrieval of
these entities.

•  Audit-attributes. Attributes generated by the DDS that
record, for each entity, the identification of the person
who created it, the date of creation, the identification of
the last person who modified it, the date of the last
modification, and the total number of modifications.
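
As an illustration of the audit-attributes, the following Python sketch keeps the five values named above for one entity and updates them on each modification. The class, the function, and the user identifiers are hypothetical; in a real DDS these values would be generated by the system rather than by the calling program.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AuditAttributes:
    """Audit-attributes kept for one dictionary entity (illustrative names)."""
    created_by: str
    created_on: date
    modified_by: Optional[str] = None   # no modification recorded yet
    modified_on: Optional[date] = None
    modification_count: int = 0


def record_modification(audit, user, when):
    """Update the audit-attributes after a change to the entity."""
    audit.modified_by = user
    audit.modified_on = when
    audit.modification_count += 1
    return audit


audit = AuditAttributes(created_by="analyst_a", created_on=date(1984, 1, 10))
record_modification(audit, "analyst_b", date(1984, 2, 1))
record_modification(audit, "analyst_c", date(1984, 3, 5))

print(audit.modified_by, audit.modification_count)  # analyst_c 2
```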

The entity itself can be conveniently separated into
three entity-types: Data, Process, and Usage entity-types.

a.   Data Entity-types

The most common entity-types of this kind, listed with
their typical attribute-types and relationship-types, are:

(1) Item/Data Element. In some systems the
lowest entity-type is Item, which is considered to be the
atomic unit. In other systems, the lowest entity-type may be
Data Element, which in turn may contain other Data
Elements. This is usually specified by a contains clause,
which expresses the relationship.

Commonly provided attribute-types relate
to the physical characteristics of the Item/Element,
including distinctions between Source, Target, and Internal
representations, and the validation criteria that may be
required for the real-world instances of the Item/Element.
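
The contains relationship between Data Elements can be pictured as a mapping from an element to its constituents. In the Python sketch below, the element names are invented for the example and the mapping merely stands in for the contains clauses of a dictionary; expanding an element recursively yields the atomic elements it ultimately contains.

```python
# Hypothetical "contains" clauses: element name -> constituent elements.
contains = {
    "PERSON": ["NAME", "DATE-OF-BIRTH"],
    "DATE-OF-BIRTH": ["DAY", "MONTH", "YEAR"],
}


def atomic_elements(element, contains):
    """Expand a Data Element into the atomic elements it ultimately contains."""
    parts = contains.get(element)
    if not parts:
        return [element]  # no contains clause: the element is atomic
    result = []
    for part in parts:
        result.extend(atomic_elements(part, contains))
    return result


print(atomic_elements("PERSON", contains))
# ['NAME', 'DAY', 'MONTH', 'YEAR']
```

The sketch assumes that the contains clauses do not form a cycle; a DDS would need to enforce that when the clauses are entered.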

(2)  Group/Record. Systems that recognize the Item as an
entity-type will contain Group as a separate entity-type,
whereas systems that have Data Element as an entity-type do not
have an entity-type for Groups. Record is logically the same as
a Group, therefore separate entity-types for both of them may or
may not exist. A relationship-type is provided to express the
structure of the Group/Record. Commonly available
attribute-types relate to the manner in which the constituent
elements are aligned, and other physical characteristics of the
entity.

(3)  File. Relationship-types are provided to express the
structure of the file. Attribute-types relate to the access
method used, blocking and labelling information, etc.

(4)  DBMS-related Entity-types. The entity-types that
exist are dependent on the specific DBMS for which the DDS
provides support services. In all cases, the entity-types
equate to the various data descriptions used by the DBMS, such
as Schema, Subschema, Database Directory, etc.
Relationship-types and attribute-types are provided to allow
the DDS to express the structure of these entities.

(5)  Other  Data  Entity-types.  Some  DDSs  offer 
Report,  Screen,  and  Form  entity-types.  In  each  case, 
relationship-types  are  provided  that  allow  the  contents  of 
such entities to be specified in terms of the constituent
elements.

b.   Process Entity-types

The two most common Process entity-types are
Program/Module and System/Subsystem, which are discussed below.

(1)  Program/Module. This entity-type represents
information about a collection of executable code. Typical
attribute-types are the language of the source code, the size,
and the characteristics under which it operates.
Relationship-types are provided to other Programs/Modules, as
well as the data, i.e. databases, files, and elements, on which
it operates. Generally speaking, different relationship-types
are provided for input, output, and processing-in-place.

(2)  System/Subsystem. This entity-type describes a
collection of Programs and/or Modules associated with a major
function of the enterprise. Relationship-types are provided to
associate a System with Subsystems, as well as the constituent
Programs.

c.   Usage  Entity-types 

Users and their organizational environment, and the data
communication environment, can be thought of as Usage (or
External) Entity-types. Unlike data and processes, they are not
directly components of a system, but nevertheless play an
important role in its operation.

The User and organizational component entity-types may
have relationship-types that allow users to be associated with
organizational components. These components themselves, and
selected relationship-types that associate users or components
with data and process entities, may describe the
responsibilities assigned to those users.

Examples of Data Communication environment entity-types
are Terminals, Messages, and other entity-types that describe
the communication networks. Their relationship-types may
provide the associations of such entities with other usage
entities, e.g. a given terminal is assigned to a certain set of
people or a certain organizational unit. Or they may provide
such associations with data and process entities, e.g. a given
terminal is authorized to execute only a given set of
transaction programs, or to access only certain Files or
Databases. It is appropriate to note here that the role of the
DDS in these matters is strictly that of a repository for
documentation, and that the DDS by itself cannot be expected to
enforce such restrictions and limitations. In order for the DDS
to be used for enforcement, appropriate "active" interfaces
would have to exist to assure that the restrictions and controls
which are documented in the data dictionary are always invoked
at execution time. Such interfaces, on the other hand, would
create more complexity and much overhead.
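
The difference between a passive repository and an "active" interface can be illustrated with a short Python sketch. It assumes that the dictionary documents which transaction programs each terminal may execute; the table AUTHORIZED_PROGRAMS, the terminal and program names, and both functions are invented for illustration only.

```python
# Documentation held in the dictionary: terminal -> authorized programs.
# The terminal and program names are invented for illustration.
AUTHORIZED_PROGRAMS = {
    "TERM01": {"PAYUPD", "PAYRPT"},
    "TERM02": {"PERSRPT"},
}


def is_authorized(terminal: str, program: str) -> bool:
    """Look up the documented restriction; unknown terminals get nothing."""
    return program in AUTHORIZED_PROGRAMS.get(terminal, set())


def run_transaction(terminal: str, program: str, action):
    """An 'active' interface: the check is invoked at execution time."""
    if not is_authorized(terminal, program):
        raise PermissionError(f"{terminal} may not execute {program}")
    return action()
```

Without run_transaction the table is documentation only. The extra lookup on every execution is a small instance of the overhead that an active interface introduces.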

3.   Extensibility Facilities

The  concept  of  extensibility  facilities  is  to  allow 
an  installation  to  modify  the  system-standard  schema  as 
delivered  by  the  DDS  vendor.  Any  new  schema  descriptor 
created  through  the  use  of  extensibility  facilities  will  be 
referred to as an extensibility descriptor.

Extensibility facilities are extremely powerful, and
they should be used with great care because extensions to the
system-standard schema, once they are used, can only be undone
with some difficulty. Additionally, changes to the system may
create confusion among the users of the DDS and decrease their
confidence in the system. For these reasons, it is recommended
that their usage be restricted to the Dictionary Administrator,
the person who is responsible for the DDS function, i.e., the
recording of all meta-information and meta-data and its
maintenance through the use of the DDS, along with making its
facilities available to the users of the system.

There  are  three  kinds  of  such  facilities  that  exist 
in    current    DDSs    : 

a.  Entity-type  extensibility:  the  ability  to  add  new 
entity-types   to    the    dictionary. 

b.  Attribute-type extensibility: new attribute-types
for either entity-types or relationship-types can be
declared using this facility.

c.  Relationship-type extensibility: this facility
allows the installation to declare new relationship-types.
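
The three kinds of extensibility can be sketched as operations on a small schema registry. The class DictionarySchema below, its starting entity-types, and its method names are assumptions made for this sketch; for brevity it supports new attribute-types on entity-types only, and it records each extension as an extensibility descriptor.

```python
class DictionarySchema:
    """Hypothetical schema registry with the three kinds of extensibility."""

    def __init__(self):
        # A small stand-in for the system-standard schema.
        self.entity_types = {"ELEMENT": set(), "RECORD": set()}
        self.relationship_types = {"CONTAINS": ("RECORD", "ELEMENT")}
        self.extensions = []          # extensibility descriptors

    def add_entity_type(self, name):
        if name in self.entity_types:
            raise ValueError(f"entity-type {name} already exists")
        self.entity_types[name] = set()
        self.extensions.append(("entity-type", name))

    def add_attribute_type(self, entity_type, attribute):
        self.entity_types[entity_type].add(attribute)
        self.extensions.append(("attribute-type", f"{entity_type}.{attribute}"))

    def add_relationship_type(self, name, source, target):
        # Both ends must already be known entity-types.
        for end in (source, target):
            if end not in self.entity_types:
                raise KeyError(f"unknown entity-type {end}")
        self.relationship_types[name] = (source, target)
        self.extensions.append(("relationship-type", name))
```

Keeping the list of descriptors separate from the standard schema makes it possible to see afterwards exactly what an installation has added.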

4.   Status Facilities

These facilities allow the DDS to be used in a System
Life Cycle environment where, for instance, a certain entity
may be part of both a production system and a new test system.
Because of this intended usage, it is preferable to maintain
the same name for the entity in its different stages.
Therefore, a facility is required that will allow two distinct
dictionary entities to have the same name, yet different
attributes and relationships. In some systems, such entities
can be distinguished by assigning different version numbers to
them. In systems having a status facility, it can be
accomplished either by:

a.  Appending the entity-status to the entity-name,
which provides the uniqueness of the name.

b.  Logically partitioning the dictionary into separate
databases for different statuses, and requiring uniqueness of
the name only within each partition.
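
Both approaches amount to making the pair (entity-name, status) unique. A minimal Python sketch, in which the class StatusDictionary, the status values, and the sample record name are hypothetical:

```python
class StatusDictionary:
    """Hypothetical dictionary keyed by (entity name, status)."""

    def __init__(self):
        self._entities = {}

    def add(self, name, status, attributes):
        key = (name, status)          # the status makes the name unique
        if key in self._entities:
            raise ValueError(f"{name} already exists in status {status}")
        self._entities[key] = dict(attributes)

    def get(self, name, status):
        return self._entities[(name, status)]

    def partition(self, status):
        """All entities of one status, i.e. one logical partition."""
        return {n: a for (n, s), a in self._entities.items() if s == status}
```

For example, a record PAYREC could be stored once with status PRODUCTION and once with status TEST, each with its own attributes, while a second PAYREC in the same status would be rejected.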

5.   Dictionary Commands

A DDS may have one or more interfaces that allow a user
to interact with the dictionary. Such an interface may be in
the form of:

•  A command language.

•  A screen-oriented interface.

•  A fixed-format batch data entry facility.

•  A programmatic interface that allows user-written
application programs to access the dictionary.

A  screen-oriented  interface  is  more  user-friendly 
compared  to  the  others.  It  results  in  higher  utilization  of 
computer  system  resources,  but  makes  the  DDS  available  to  a 
larger  class  of  users.  Another  benefit  may  be  that  the 
error  rate  is  substantially  reduced. 

The  dictionary  commands  may  be  differentiated  into 
eight  categories  on  the  basis  of  their  functionality: 

a.  Dictionary  Maintenance  Commands:  this  enables  an 
installation  to  create  and  maintain  its  data  dictionary. 

b.  Reports  and  Queries  Commands:  an  installation 
may  generate  reports  using  meta-data  contained  within  the 
data  dictionary  using  these  commands. 

c.  Data Structure Interface Commands: enable other
systems to use the descriptions of data structures contained
within the dictionary.

d.  Extensibility  Commands:  vehicle  to  exploit  the 
extensibility  facilities. 

e.  Status-related Commands: the ability to distinguish
entities in different stages of the life cycle.

f.  Security Commands: used to allow security
declarations to be assigned to the dictionary.

g.  Dictionary  Processing  Control  Commands:  used  to 
control  the  dictionary  process  such  as  logon-logoff, 
processing  defaults,  etc. 

h.  Dictionary Administrator Commands: exclusive
commands especially designed to be used by the dictionary
administrator.

6.   Bridge Facilities

Other facilities of a DDS exist in the form of bridges
or interfaces to other systems. The contents of the dictionary
may be made available as part of the processing functions of
such a system. In each case, these systems provide tools whose
functionality is outside the DDS but which require data about
entities which can be expected to exist in the dictionary. By
accessing a dictionary and extracting the required information,
the disadvantages of having to store and maintain redundant
data are eliminated.

These interfaces are:

a.  Report  and  Query  System:  the  ability  to  support 
various  kinds  of  reports  and  queries. 

b.  Validation  Criteria:  support  other  systems  by 
providing  a  module  which  performs  the  specified  validation 
and  which  can  be  inserted  into  a  program. 

c.  Database  Design:  the  ability  to  provide  basic 
data  needed  by  automatic  database  designers. 

d.  Test Data Generation: support test data generation
by providing descriptions of the structures and formats of the
files and databases.
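
As an illustration of the test data generation bridge, the sketch below builds a fixed-length test record from element descriptions extracted from a dictionary. It assumes that each description is a (name, length, character-type code) triple with the codes A, N, and AN; the layout, the element names, and the function generate_record are invented for this example.

```python
import random
import string

# Character sets per character-type code; the codes are an assumption
# made for this sketch.
ALPHABETS = {
    "A": string.ascii_uppercase,
    "N": string.digits,
    "AN": string.ascii_uppercase + string.digits,
}


def generate_record(elements, rng):
    """Build one fixed-length test record from (name, length, code) triples."""
    fields = []
    for _name, length, code in elements:
        fields.append("".join(rng.choice(ALPHABETS[code]) for _ in range(length)))
    return "".join(fields)


# Hypothetical record layout as it might be extracted from the dictionary.
layout = [("NAME", 10, "A"), ("SERIAL", 6, "N"), ("UNIT", 4, "AN")]
record = generate_record(layout, random.Random(0))
```

The generator needs no knowledge of the application itself; everything it uses comes from the stored descriptions, which is the point of such a bridge.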

7.   DDS Security

As mentioned before, the term DDS security is applied
here to denote the security of the DDS itself. Entities in the
dictionary may have attributes describing access
characteristics to the real-world instances of these entities,
but this data is entirely informational in nature and cannot be
enforced by the DDS, since the system is not part of the loop
in the execution of programs against the "real data". On the
other hand, unauthorized access to the DDS can be prevented by
applying security procedures, e.g., the assignment of
passwords, or the inclusion of security levels which control
the various kinds of access to dictionary entities. To gain
more integrity and reliability, the security of the DDS must be
considered to be related to the security of the entire computer
system. The level of security existing in the computer system
is influenced by the security of the basic systems software and
the physical security of the installation, as well as the
procedures used by the personnel of the installation. These
latter are often notably lax, and it is not at all unusual to
observe cases where passwords are not kept confidential and
may, indeed, openly be shared with unauthorized people.

D.   COST/BENEFIT ANALYSIS FOR DDS

As with any system acquisition, a Cost/Benefit analysis
should be done beforehand in order to justify DDS acquisition,
implementation, and usage. The following lists of costs and
savings represent tangible items that can be used in assessing
the costs and benefits for an economic study of the feasibility
of implementing a DDS.

1.   Costs

There are eight possible costs which may be considered:

a.  Acquisition  cost  is  the  accumulation  of  lease  or 
purchase  cost  and  the  maintenance  cost  of  the  system. 

b.  Data Administration staff cost is self-explanatory.

c.  Hardware  Cost  is  the  sum  of  storage  device  cost 
and  CPU  time  cost. 

d.  Start-up cost is the total cost of training data
administrator staff and all activities such as developing a
comprehensive plan (see Table I as an example) that should
be done in any data dictionary implementation.

TABLE I
Comprehensive Plan for DDS's Usage

1.  Development of a policy for the use of the DDS.

2.  Development of standards to be followed in the
dictionary, including naming conventions for
dictionary entities.

3.  Development of decisions on how to use the control
facilities of the DDS, such as the status and
security facilities.

4.  Delegation of authority and responsibility for the
use of various DDS facilities.

5.  Definition of procedures for the use of the DDS,
and development of the required policies to
implement these procedures.

6.  Design and implementation of customized features
for the DDS, should any be required. This may
include changes to the dictionary schema to allow
new types of information resources to be stored
in the dictionary, production of specialized
reports, or interfaces to other software systems.



e.  Data collection cost is a function of the number
of entities, attributes, and relationships which are to be
put in the dictionary.

f.  Maintenance cost will depend on the degree of
changes to the application system or systems controlled by
the DDS.

g.  Application system change cost is a cost pertinent
to any change to the application systems due to the
implementation of the dictionary for reasons of efficiency,
integrity, and maintainability.

h.  User education cost is the cost for training
people involved in data dictionary usage in addition to the
data administrator staff.

2.   Factors for Estimating Savings and Benefits

There are four factors which can be used as an aid in
quantifying the savings. The more fully these four factors
apply to the enterprise and its operations, the closer the
savings and benefits can be expected to come to the high end of
the estimates. The four factors are:

a.   The   Maturity    of   an    Information   Processing 
Environment 

This is a major factor in the benefits that can be
attained with a DDS. Increased maturity will help
substantially in the integration of dictionary facilities
into the operations of the enterprise.

b.   The Complexity of the Environment

The number of data elements, files, databases, and
programs can be used to measure how complex the information
processing environment is. Problems caused by complexity tend
to worsen geometrically with the number of such elements,
files, databases, and programs. The value of a DDS will be
greater as complexity increases.

c.   The  Degree  of  Data  Sharing 

It is common practice for data elements to be shared by
different programs, some of which belong to different systems.
An important issue is that changes in one part of the system
tend to have effects in many other parts of the system. Failure
to compensate for such changes can cause production failures
and unanticipated costs. Tracking the effect of these changes
is a valuable feature of a DDS. It is made possible by
evaluating the attributes and relationships of changed entities
as the basis for further tracking.

d.   Personnel Turnover

The personnel associated with information processing
systems are either data processing organization personnel or
personnel of the user's organization who deal with the
information processing system. In this regard, a DDS offers two
advantages. First, information which otherwise might be stored
in the minds of individuals, and which may be lost to the
organization with the loss of the individual, is now placed in
the DDS. Secondly, new personnel can learn the system more
quickly than they could without the use of a DDS.
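
The change tracking described under the degree of data sharing can be sketched as a walk over the relationships stored in the dictionary. The Python fragment below assumes that the dictionary can supply, for each entity, the entities that use it; the table USED_BY and all entity names in it are invented for illustration.

```python
from collections import deque

# Hypothetical "is used by" relationships recorded in the dictionary.
USED_BY = {
    "RANK": ["PERSREC", "PAYREC"],
    "PERSREC": ["PERSUPD"],
    "PAYREC": ["PAYCALC", "PAYRPT"],
    "PAYCALC": ["PAYRPT"],
}


def affected_by(changed, used_by):
    """Breadth-first walk over the relationships of a changed entity."""
    seen = set()
    queue = deque([changed])
    while queue:
        entity = queue.popleft()
        for dependent in used_by.get(entity, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)
```

With this table, a change to the element RANK reaches both records and every program that reads them, whereas a change to PAYCALC reaches only PAYRPT.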

3.   Savings and Benefits

There are five areas in which savings and benefits may
be expected. The four factors mentioned in the previous
section can be used as aids in predicting specific monetary
savings in each of the following areas:

a.  System design and development: the prime advantage
of the DDS is in its use for better communication between
users and implementors. This results in fewer changes or
iterations and consequently faster programming, because the
specification is better documented and understood by all
parties.

b.  System maintenance: better and more complete
documentation in the dictionary, the ability to analyze the
effect of proposed changes, and the improved communication
between users and maintenance programmers on proposed
changes or corrections.

c.  Data redundancy: reduction of unplanned data
redundancy will result in an improved system which has
greater integrity and better operability, as well as
potentially decreased requirements for random storage
devices.

d.  Database creation: may take advantage of the
descriptions contained within the DDS to reduce iterations
in the design process and to reach faster concurrence by all
parties on the contents of a database.

e.  Improved Communication: self-explanatory.

IV.  INITIAL DATA DICTIONARY DESIGN FOR DISPULLAHTAD

A.  GENERAL

As mentioned before, the initial design of the Data
Dictionary will be limited to four application areas:
personnel, payroll, intelligence personnel, and territorial
personnel. This design is a first step in the Stage
Development approach used; therefore it may be expanded in
the future.

B.  DATA DICTIONARY SCHEMA/SUBSCHEMA

In a manner analogous to the context of a DBMS, Data
Dictionary Schema denotes the logical structure of the data
dictionary (DD). In this regard, then, the term Subschema
will denote a subset of the schema to be seen by a given
application (process) or user [Ref. 6], and it is compatible
with application views of a database [Ref. 7].

1.   Data Dictionary Schema

The structural characteristics of the DD and the
contents of the DD schema are important aspects of the usage
of a DDS, since by evaluating these, users may know what
kinds of meta-data, and what relationships between them, exist
within the DD. As suggested by Lefkovits et al [Ref. 8],
the structural characteristics of a DD may be described in
logical terms in order to gain a clearer insight into what
kinds of meta-data are supported by the DDS. The logical
structure of a typical DD drawn by Allen et al [Ref. 5] seems
to be appropriate for DISPULLAHTAD's DD with only minor
changes, and it is presented as Figure 4.1.

Figure 4.1    Logical Structure of DISPULLAHTAD's DD
(adapted from Allen et al [Ref. 5]).


There are three kinds of entities (from left to right)
within the logical DD in Figure 4.1:

a.   Data Entities

These consist of database, subschema, relationship,
file, group of elements, and data element entities. Each
record may consist of several elements, but may or may not
have a group of elements in it. A file and/or record entity
is the subject of a process entity, while a data element
entity is the subject of a transaction or a report entity.

b.   Process Entities

A process entity may be an application program, a
program module, or a system/subsystem that typically
generates a report or performs a transaction involving either
a data element or a group of data elements.

c.   Usage  Entities 

Included in this category are users and their
organizational environment (such as processors and
terminals), and the data communication environment (such as
communication networks, communication nodes, messages, etc.).

2.   Data Dictionary Subschemas

Since there are four applications included in this
initial design, there might be four subschemas accordingly.
Due to the relatively small amount of meta-data that will be
stored in the initial dictionary, however, and in order to
reduce complexity, it is better not to apply subschemas at
this point. Later, if the dictionary has grown substantially
and many applications have been added, the subschemas may be
applied in order to improve efficiency and security.

C.   DISPULLAHTAD'S DATA DICTIONARY

The design of this DD is based on current applications
for which the DBMS has not been implemented yet. The
following tables present the lists of Data, Process, and
Usage entities and one or two of the actual instances of
DISPULLAHTAD's data dictionary.

1.   Entities

a.  Data Entities

Table II summarizes data entities abstracted from the
personnel application [Ref. 9 and Ref. 10]. Table III
presents data entities for the payroll application [Ref. 11
and Ref. 12]. Data entities used in the intelligence
personnel application are presented as Table IV [Ref. 13 and
Ref. 14]. Finally, data entities from the territorial
personnel application are shown in Table V [Ref. 15].

b.  Process  Entities 

Process entities belonging to the four applications
are presented as Table VI. This table lists only routine
processes, not irregular processes, since the latter are not
yet standardized (they are done as requested).

c.  Usage  Entities 

Table VII presents user entities for the four
applications.

2.   Relationships

The relationships can be represented by the following
relation:

RELATIONSHIPS (relation_name, key, composite_key, secondary_key,
               other_entity_names)
key 

a.   Relation Name

In the pre-DBMS situation, this may be filled with the
record entity name to represent the relationships between
element data entities. In the case where a DBMS has been
implemented, it should be filled with the relation name,
since a logical record may consist of several relations in
order to reduce complexity and/or fulfill the five normal
forms.

b.   Key

This is an entity name that is used as the key for
both storing and accessing the relation. If the relation
uses a composite key, this will be the first part of the
composite key, and the second part will be stored as the
composite-key attribute.

c.  Composite  Key 

This is the entity name used as the second part of a
composite key. If the primary key is not composite, this
attribute will be filled with "NONE".

d.  Secondary Key

This is the entity name used as the secondary key.
If the relation has no secondary key, this attribute will be
filled with "NONE".

e.   Other Entity Names

The names of the other entities (except keys) in the
relation will be stored in this attribute. An ampersand sign
( & ) will be used between two entity names, and the phrase
"REPETITIONS OF" will be written in front of a repeating
group of elements.
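
To make these conventions concrete, the following Python sketch represents one tuple of the RELATIONSHIPS relation and decodes its primary key and its other entity names. The field names follow the relation as described above; the sample entity names, the exact spacing around the ampersand, and the helper functions are assumptions made for illustration.

```python
from typing import NamedTuple


class Relationship(NamedTuple):
    relation_name: str
    key: str
    composite_key: str        # "NONE" when the primary key is not composite
    secondary_key: str        # "NONE" when there is no secondary key
    other_entity_names: str


def primary_key(rel):
    """Return the primary key as a tuple of one or two entity names."""
    if rel.composite_key == "NONE":
        return (rel.key,)
    return (rel.key, rel.composite_key)


def other_entities(rel):
    """Split the '&'-separated names, flagging repeating groups."""
    prefix = "REPETITIONS OF "
    result = []
    for raw in rel.other_entity_names.split("&"):
        name = raw.strip()
        repeating = name.startswith(prefix)
        result.append((name[len(prefix):] if repeating else name, repeating))
    return result
```

A tuple whose last attribute reads "RANK & UNIT & REPETITIONS OF AWARD" would thus yield two ordinary entities and one repeating group.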

3.   Attributes

Every entity has information attached to it called
attributes. In the following, all information that may be
included as attributes will be discussed.

a.   Data Entities

There will be several pieces of information included
for each data entity (either file, record, or element
entity). Since the DD will typically be implemented as one
of the DBMS applications, these can be represented as the
following relations:

FILE_ENTITY (entity_name, block_size, access_method,
             logical_record_size, physical_storage_device)
key

RECORD_ENTITY (entity_name, length, fixed_variable_code, key,
               composite_key, secondary_key,
               updating_time, updating_mode)
key

ELEMENT_ENTITY (entity_name, length, code, source, user,
                definition)
key

(1)   Entity Name.

The name of the entity is limited to a maximum of eight
characters. It may be a combination of alphabetic and numeric
characters, but should contain an alphabetic character as its
first character.

(2)   Block Size.

Self-explanatory.

(3)   Access Method.

This may be encoded as:
S   -   Sequential
I   -   Indexed Sequential
D   -   Direct Access
V   -   Virtual Storage (VSAM)

(4)   Logical Record Size.

This is equal to record length.

(5)  Physical Storage Device.

This may be coded as follows:
TAPE  -   Magnetic Tape
DISK  -   Magnetic Disk
DRUM  -   Magnetic Drum
FLOP  -   Floppy Disk / Diskette
CARD  -   Punched Card
PAPR  -   Paper Tape

(6)  Fixed / Variable Code.

The codes used are:

FIX  -   Fixed Length Record
VAR  -   Variable Length Record

(7)  Key.

This is an entity name used as the key for both storing
and accessing the relation. If the relation uses a composite
key, this will be the first part of the composite key, and the
second part will be stored as the composite-key attribute.

(8)  Composite Key.

This is the entity name used as the second part of a
composite key. If the primary key is not composite, this
attribute will be filled with "NONE".

(9)  Secondary Key.

This is the entity name used as the secondary key. If
the record has no secondary key, this attribute will be filled
with "NONE".

(10)  Updating Time.

This attribute contains a description of how often this
record will be updated. This will be a number of days.

(11)  Updating Mode.

This attribute contains a description of how the
updating is done. It is encoded as follows:

BATCH   -   Batch Processing
ONLIN   -   Online Processing
BOTH    -   Both Batch and Online

(12)  Entity Length.

This denotes the length of an entity (which may be a
record or an element). In the case of a variable record, this
information will be filled with zeroes, since the length of
each record will be attached within the record itself.

(13)  Entity Code.

This is a code for the character type:
A   -   Alphabetic
N   -   Numeric
AN  -   Alphanumeric

(14)  Source.

This denotes the organizational entity responsible for
providing, updating, and deleting the entity.

(15)  Users.

This denotes the organizational entity (entities)
allowed to retrieve and use the entity. If subschemas are
applied to the DD, this attribute will not be necessary.

(16)  Definition.

This provides a detailed narrative describing the
entity. This may include information about
frequency_of_update, range_of_acceptable_values, and so on.

(17)   Alias.

At the present time, no entity name (especially no
element entity name) has an alias attribute attached to it.
In the future, this attribute will surely be needed. One way
to accommodate this need is to add another relation, called
the alias relation, which has at least two attributes:
entity-name and its alias-name.
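
Several of the rules above lend themselves to mechanical checking when entries are loaded into the dictionary. The Python sketch below checks the entity name rule, the fixed/variable code, and the zero length of variable records, and resolves names through an alias relation. The function names and the sample names are hypothetical; treating the alias relation as a mapping from alias-name to entity-name is an assumption.

```python
import re

# At most eight characters, alphabetic first, then letters or digits.
ENTITY_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]{0,7}")

FIXED_VARIABLE_CODES = {"FIX", "VAR"}


def valid_entity_name(name):
    return ENTITY_NAME.fullmatch(name) is not None


def check_record(entity_name, length, fixed_variable_code):
    """Return a list of rule violations for a record entity."""
    errors = []
    if not valid_entity_name(entity_name):
        errors.append("bad entity name")
    if fixed_variable_code not in FIXED_VARIABLE_CODES:
        errors.append("bad fixed/variable code")
    # Variable-length records carry a length of zero in the dictionary.
    if fixed_variable_code == "VAR" and length != 0:
        errors.append("variable record must have length 0")
    return errors


def resolve_alias(name, aliases):
    """Map an alias-name to its entity-name; other names pass through."""
    return aliases.get(name, name)
```

Checks of this kind keep the coded attributes consistent without requiring the person entering the data to remember every convention.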

b.   Process Entities

The information included as attributes of the process
entity are: name, input_entity, output_entity, and the
description of the process. These can be represented as the
relation:

PROCESS (process_name, input_entity, output_entity,
         description_of_process, descr_of_input, input_media,
         descr_of_output, output_media)
key

But since a process may have more than one input or output
entity, this relation should have a composite key rather than
having only the name as its single key. Because only one
input/output entity is allowed in every instance of PROCESS
[Ref. 6], processes having more than one input/output entity
may waste storage, since all attributes will appear
unnecessarily more than once. In the case that this relation
has a composite key, a query asking which data entities are
input (or output) to a given process also poses a difficulty,
since this relation cannot be retrieved using only the
process_name (it must be retrieved using its composite key
instead). In order to obtain a better trade-off, those
attributes may be arranged using the following three
relations:

PROCESS (process_name, description)
key

PROCESS_INPUT (process_name, input_entity, description,
               media)
key

PROCESS_OUTPUT (process_name, output_entity, description,
                media)
key
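
The three-relation arrangement can be tried out with any relational system. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for the eventual DBMS. The choice of primary keys (process_name alone for PROCESS, and process_name together with the input or output entity for the other two relations) is an assumption of this sketch, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE process (
    process_name TEXT PRIMARY KEY,
    description  TEXT
);
CREATE TABLE process_input (
    process_name TEXT REFERENCES process(process_name),
    input_entity TEXT,
    description  TEXT,
    media        TEXT,
    PRIMARY KEY (process_name, input_entity)
);
CREATE TABLE process_output (
    process_name  TEXT REFERENCES process(process_name),
    output_entity TEXT,
    description   TEXT,
    media         TEXT,
    PRIMARY KEY (process_name, output_entity)
);
""")

# Invented sample rows; the process description is stored only once.
conn.execute("INSERT INTO process VALUES ('PAYCALC', 'Monthly payroll run')")
conn.executemany(
    "INSERT INTO process_input VALUES (?, ?, ?, ?)",
    [("PAYCALC", "PAYREC", "sorted by RANK", "DISK"),
     ("PAYCALC", "PERSREC", "sorted by RANK", "TAPE")],
)


def inputs_of(process_name):
    """Which data entities are input to a given process?"""
    rows = conn.execute(
        "SELECT input_entity FROM process_input "
        "WHERE process_name = ? ORDER BY input_entity",
        (process_name,),
    )
    return [row[0] for row in rows]
```

The query in inputs_of needs only the process name, which is the retrieval that was awkward with the single PROCESS relation.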

C)   ££2^§§§  Name. 

Process  name  will  be  limited  to  maximum  of 
eight  characters  as  applied  to  data  entities. 

(2)  Description  of  Process. 

Self  explanatory. 

(3)  Input  Data. 

The  input  data  may  be  either  a  file, 
record,  group  of  elements,  data  element,  or  data  entered 
from  console. 

(4)  Description  of  Input. 

This  attribute  may  be  filled  with  a 
description  such  as  the  input  data  is  "sorted  by  RANK"  for 
instance. 

(5)  Input Media.

As with the physical storage device attribute, this attribute may be filled with the input storage device, or with data entered via console:

TAPE  -  Magnetic Tape
DISK  -  Magnetic Disk
DRUM  -  Magnetic Drum
FLOP  -  Floppy Disk / Diskette
CARD  -  Punched Card
PAPR  -  Paper Tape
CONS  -  Data Entered via Console

(6)  Output Data.

The output data may be magnetic data, displayed data, or printed material.

(7)  Description  of  Output. 

This attribute may be filled with a description, for instance that the output data is "sorted by KOTAMA".

(8)  Output Media.

This attribute may be filled with the output storage device or printed output material, and these are encoded as follows:

TAPE  -  Magnetic Tape
DISK  -  Magnetic Disk
DRUM  -  Magnetic Drum
FLOP  -  Floppy Disk / Diskette
CARD  -  Punched Card
PAPR  -  Paper Tape
PRIN  -  Printed Material
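
The three process relations and the media codes can be sketched as SQL tables, driven here from Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the thesis does not prescribe a particular DBMS, the column types and CHECK constraints are choices made for this sketch, and the four-letter media codes are as read from the lists above.

```python
import sqlite3

# Media codes as read from the lists above (an assumption of this sketch).
INPUT_MEDIA = ("TAPE", "DISK", "DRUM", "FLOP", "CARD", "PAPR", "CONS")
OUTPUT_MEDIA = ("TAPE", "DISK", "DRUM", "FLOP", "CARD", "PAPR", "PRIN")


def sql_list(codes):
    """Render codes as a SQL value list: 'TAPE', 'DISK', ..."""
    return ", ".join(f"'{code}'" for code in codes)


def create_process_tables(conn):
    """Create the three process relations; column types are assumptions."""
    # REFERENCES is declarative here: SQLite enforces it only when
    # PRAGMA foreign_keys is switched on.
    conn.executescript(f"""
        CREATE TABLE process (
            process_name TEXT PRIMARY KEY CHECK (length(process_name) <= 8),
            description  TEXT
        );
        CREATE TABLE process_input (
            process_name TEXT NOT NULL REFERENCES process (process_name),
            input_entity TEXT NOT NULL,
            description  TEXT,
            media        TEXT CHECK (media IN ({sql_list(INPUT_MEDIA)})),
            PRIMARY KEY (process_name, input_entity)
        );
        CREATE TABLE process_output (
            process_name  TEXT NOT NULL REFERENCES process (process_name),
            output_entity TEXT NOT NULL,
            description   TEXT,
            media         TEXT CHECK (media IN ({sql_list(OUTPUT_MEDIA)})),
            PRIMARY KEY (process_name, output_entity)
        );
    """)


def try_insert(conn, sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
create_process_tables(conn)
try_insert(conn, "INSERT INTO process VALUES (?, ?)", ("DPPS17", None))
accepted = try_insert(
    conn,
    "INSERT INTO process_input VALUES (?, ?, ?, ?)",
    ("DPPS17", "DAPOKDPP", "sorted by DPP", "TAPE"),
)
rejected = try_insert(
    conn,
    "INSERT INTO process_input VALUES (?, ?, ?, ?)",
    ("DPPS17", "KEKPANGN", None, "PRIN"),  # PRIN is an output-only code
)
print(accepted, rejected)
```

With the composite keys in place, each input or output entity of a process occupies one short row, and the process description is stored only once in PROCESS.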

c.  Usage Entities

The information pertinent to these entities has already been included and can be derived from the data entities, in terms of who is responsible for update operations and who is allowed to retrieve a data entity. Here, this information will be stated again from the reverse point of view, that is, which data are the responsibility of this entity, and which data this entity is allowed to retrieve. Information that will be included as the usage entity's attributes are: user_name, description of the user, entity_name, and type_of_access. For reasons similar to those discussed concerning the process entities, since a given user may be responsible for, and/or allowed to retrieve, more than one data entity, these attributes may be represented by the following three relations:

USER (user_name, description)
      key: user_name

USER_ACCESS (user_name, entity_name, type_of_access)
             key: user_name, entity_name

USER_RESPONSIBILITY (user_name, entity_name)
                     key: user_name, entity_name

(1)  User Name.

As with the data entity and process entity names, the usage entity name will be limited to a maximum of eight characters, too.

(2)  Entity Name.

Entity_name here means a data entity name.

(3)  User  Responsibility. 

This may be a list of file, record, group of elements, or data element entities that are this user's responsibility. This can be viewed as a subschema.

(4)  User Access.

This may be a list of file, record, group of elements, or data element entities that may be retrieved by this user. This can be viewed as a subschema, and may or may not be the same as the list of user responsibility items.

(5)  Type of Access.

The type_of_access is either:
R  -  Read only
U  -  Update only
B  -  Both R and U
N  -  No access

(6)  Description of Entity.

Self  explanatory. 
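
The usage relations can be sketched in the same way. The following is a minimal illustration, not a prescribed implementation: the table is named dd_user rather than USER purely as a precaution in this sketch, the user names PERSMGR and AUDITOR are hypothetical, and the access codes R, U, B, and N are as read from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dd_user (
        user_name   TEXT PRIMARY KEY CHECK (length(user_name) <= 8),
        description TEXT
    );
    CREATE TABLE user_access (
        user_name      TEXT NOT NULL,
        entity_name    TEXT NOT NULL,
        type_of_access TEXT NOT NULL
                       CHECK (type_of_access IN ('R', 'U', 'B', 'N')),
        PRIMARY KEY (user_name, entity_name)
    );
    CREATE TABLE user_responsibility (
        user_name   TEXT NOT NULL,
        entity_name TEXT NOT NULL,
        PRIMARY KEY (user_name, entity_name)
    );
""")

# Hypothetical users and access rights, for illustration only.
conn.executemany(
    "INSERT INTO dd_user VALUES (?, ?)",
    [("PERSMGR", "personnel management"), ("AUDITOR", "hypothetical reviewer")],
)
conn.executemany(
    "INSERT INTO user_access VALUES (?, ?, ?)",
    [
        ("PERSMGR", "PERSFILE", "B"),
        ("PERSMGR", "DAPOKDPP", "R"),
        ("AUDITOR", "PERSFILE", "R"),
        ("AUDITOR", "DAPOKDPP", "N"),
    ],
)
conn.execute("INSERT INTO user_responsibility VALUES ('PERSMGR', 'PERSFILE')")


def retrievable_entities(conn, user_name):
    """Entities the user may read: access type R (read) or B (both)."""
    rows = conn.execute(
        "SELECT entity_name FROM user_access "
        "WHERE user_name = ? AND type_of_access IN ('R', 'B') "
        "ORDER BY entity_name",
        (user_name,),
    )
    return [row[0] for row in rows]


def responsibilities(conn, user_name):
    """Entities whose update is this user's responsibility."""
    rows = conn.execute(
        "SELECT entity_name FROM user_responsibility "
        "WHERE user_name = ? ORDER BY entity_name",
        (user_name,),
    )
    return [row[0] for row in rows]


print(retrievable_entities(conn, "PERSMGR"))
print(responsibilities(conn, "PERSMGR"))
```

Because the user description lives only in the first relation, a user with many entities repeats nothing but the user_name key.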

d.  Summary of Relations

Table VIII summarizes the relations of relationships, data entities, process entities, and usage entities.

e.  Example of Relations

Table IX presents examples of file, record, and element entity relations. Examples of process entity relations and user entity relations are presented as Table X. Finally, an instance of the relationship relation is shown in Table XI.

4.  Example of Data Dictionary Queries

Meta-data contained within the data dictionary may be used to answer questions asked by top managers, users, or technical staff (such as system analysts, programmers, and operators). In the following, several queries and the corresponding responses will be presented.

a.   Top  Management  Queries 

Top management may ask a question like: "How often is the personnel master file updated?"
Possible answer is:
PERSFILE is updated once every 30 days.

b.  User Queries

The user responsible for personnel management may ask the following question: "I need a list of personnel having the rank of captain who speak French fluently and are experienced in the intelligence field. Is DISPULLAHTAD able to provide these data?"

Possible  answer  is: 
See  entities: 

1.  PANGKAT   (rank) in relation PERSINTL

2.  ASING     (foreign language) in relation PERSINTL

3.  AKPASING  (active/passive code) in relation PERSINTL

c.   Technical  Staff  Queries 

"What  are  the  inputs  and/or  outputs  of  process 
DPPS17  ?",  is  one  possible  question  asked  by  an  operator  for 
instance. 

Possible answer is:
Process DPPS17
Input is/are:

1.  DAPOKDPP (sorted by DPP), media: TAPE

Output is/are:

1.  DAPOKDPP (sorted by KOTAMA), media: TAPE

2.  KEKPANGN (sorted by KOTAMA), media: DISK
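
The operator's question can be answered with one lookup per relation, using only the process name. The sketch below assumes the PROCESS_INPUT and PROCESS_OUTPUT relations are held as SQLite tables (an assumption of this illustration) and loads exactly the DPPS17 rows given in the answer above.

```python
import sqlite3


def build_dictionary():
    """Create the two relations and load the DPPS17 rows from the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE process_input (
            process_name TEXT, input_entity TEXT, description TEXT, media TEXT,
            PRIMARY KEY (process_name, input_entity)
        );
        CREATE TABLE process_output (
            process_name TEXT, output_entity TEXT, description TEXT, media TEXT,
            PRIMARY KEY (process_name, output_entity)
        );
        INSERT INTO process_input
            VALUES ('DPPS17', 'DAPOKDPP', 'sorted by DPP', 'TAPE');
        INSERT INTO process_output
            VALUES ('DPPS17', 'DAPOKDPP', 'sorted by KOTAMA', 'TAPE');
        INSERT INTO process_output
            VALUES ('DPPS17', 'KEKPANGN', 'sorted by KOTAMA', 'DISK');
    """)
    return conn


def process_io(conn, process_name):
    """Return (inputs, outputs) as lists of (entity, description, media)."""
    inputs = conn.execute(
        "SELECT input_entity, description, media FROM process_input "
        "WHERE process_name = ? ORDER BY input_entity",
        (process_name,),
    ).fetchall()
    outputs = conn.execute(
        "SELECT output_entity, description, media FROM process_output "
        "WHERE process_name = ? ORDER BY output_entity",
        (process_name,),
    ).fetchall()
    return inputs, outputs


def format_answer(process_name, inputs, outputs):
    """Lay the rows out in the style of the answer shown above."""
    lines = [f"Process {process_name}", "Input is/are:"]
    for n, (entity, descr, media) in enumerate(inputs, start=1):
        lines.append(f"{n}.  {entity} ({descr}), media: {media}")
    lines.append("Output is/are:")
    for n, (entity, descr, media) in enumerate(outputs, start=1):
        lines.append(f"{n}.  {entity} ({descr}), media: {media}")
    return "\n".join(lines)


conn = build_dictionary()
inputs, outputs = process_io(conn, "DPPS17")
print(format_answer("DPPS17", inputs, outputs))
```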

d.  Unanticipated Queries

One of the advantages of designing the dictionary using the relational model is that it can accommodate unanticipated queries. As one of the DBMS's applications, this dictionary supports any query expressible in a relational query language (e.g., SEQUEL).
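
As an illustration of an unanticipated query, suppose someone asks which processes read data that another process produces. Nothing in the dictionary design anticipates this question, yet it is a single join in SQL, the successor of SEQUEL. The sketch below assumes SQLite tables for the two process relations; HYPO01 is a hypothetical process added only to give the join something to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE process_input (
        process_name TEXT, input_entity TEXT, description TEXT, media TEXT,
        PRIMARY KEY (process_name, input_entity)
    );
    CREATE TABLE process_output (
        process_name TEXT, output_entity TEXT, description TEXT, media TEXT,
        PRIMARY KEY (process_name, output_entity)
    );
    INSERT INTO process_input
        VALUES ('DPPS17', 'DAPOKDPP', 'sorted by DPP', 'TAPE');
    INSERT INTO process_output
        VALUES ('DPPS17', 'DAPOKDPP', 'sorted by KOTAMA', 'TAPE');
    INSERT INTO process_output
        VALUES ('DPPS17', 'KEKPANGN', 'sorted by KOTAMA', 'DISK');
    -- HYPO01 is hypothetical; it does not appear in the thesis.
    INSERT INTO process_input
        VALUES ('HYPO01', 'KEKPANGN', NULL, 'DISK');
""")


def downstream_processes(conn):
    """List (producer, entity, consumer) where one process reads another's output."""
    return conn.execute("""
        SELECT o.process_name, o.output_entity, i.process_name
        FROM process_output AS o
        JOIN process_input AS i ON i.input_entity = o.output_entity
        WHERE i.process_name <> o.process_name
        ORDER BY o.process_name, o.output_entity, i.process_name
    """).fetchall()


print(downstream_processes(conn))
```

DPPS17 both reads and writes DAPOKDPP, but that pair is excluded because producer and consumer are the same process.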

V.  THE IMPLEMENTATION OF DATABASE AT DISPULLAHTAD

A.  DATA DICTIONARY DESIGN AS A STEPPING-STONE

The implementation of a database at DISPULLAHTAD may take advantage of the design of the Data Dictionary in the preceding chapter in many ways:

•  The designed DD may be used as the first DBMS application. The experience gained here can then be used in the future implementation of the real database.

•  The DD, as a repository of all meta-data, will provide a full specification and description of all entities and the relationships between them. Given this, the implementation of the DBMS will be faster due to fewer changes and iterations in database development.

•  Where an automatic database design tool such as DATA DESIGNER is used, it may be interfaced with the DDS in order to take advantage of the DDS contents. The database designer may benefit from the full descriptions of each entity contained within the data dictionary.

In this regard, the design of the DD for current DISPULLAHTAD applications can be considered as a stepping-stone to the DBMS implementation.

B.  DISPULLAHTAD'S DATABASE DESIGN

From Tables II through V, it can be seen that there are many data element redundancies within the four applications. These redundancies need to be eliminated by designing a database fulfilling the five normal forms. One possible method is to gather the common data elements in one relation, and the other specific data pertinent to each of the four applications in separate relations. The specific data model for each application will most likely consist of more than one relation.

In designing the database, the availability of data descriptions may be exploited to make this work easier. For example, suppose the following relation is designed in the personnel database:

MAINPERS (nepers, nama, pangkat, corps, jabatan, satminkal)
          key: nepers

Here, the possible descriptions contained within the dictionary that may be extracted are:

•  Which files and records would have to be accessed in order to establish an instance of this relation?

•  What is/are the key/composite keys of each record derived from the preceding query?

•  What is the length of each of those entities?
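
The three questions above can be sketched as dictionary lookups. Everything in this sketch is hypothetical: the single element_location table is a flattened stand-in for the dictionary's entity and relationship relations, and the file, record, and key names and all lengths are invented for illustration. Only NAMA, PANGKAT, and JABATAN echo attribute names from the relation above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE element_location (
        file_name    TEXT,
        record_name  TEXT,
        element_name TEXT,
        length       INTEGER,
        is_key       INTEGER,  -- 1 if the element is part of the record key
        PRIMARY KEY (record_name, element_name)
    )
""")
conn.executemany(
    "INSERT INTO element_location VALUES (?, ?, ?, ?, ?)",
    [
        # Hypothetical rows: names and lengths are not taken from the thesis.
        ("PERSFILE", "RECA", "IDNUM", 8, 1),
        ("PERSFILE", "RECA", "NAMA", 30, 0),
        ("PERSFILE", "RECA", "PANGKAT", 2, 0),
        ("HYPOFILE", "RECB", "IDNUM", 8, 1),
        ("HYPOFILE", "RECB", "JABATAN", 20, 0),
    ],
)


def placeholders(values):
    """One '?' per value, for a parameterized IN (...) list."""
    return ", ".join("?" for _ in values)


def sources_for(conn, elements):
    """Files and records that must be read to assemble the given elements."""
    sql = (
        "SELECT DISTINCT file_name, record_name FROM element_location "
        f"WHERE element_name IN ({placeholders(elements)}) "
        "ORDER BY file_name, record_name"
    )
    return conn.execute(sql, list(elements)).fetchall()


def key_elements(conn, record_name):
    """Key (or composite key) elements of one record."""
    rows = conn.execute(
        "SELECT element_name FROM element_location "
        "WHERE record_name = ? AND is_key = 1 ORDER BY element_name",
        (record_name,),
    )
    return [row[0] for row in rows]


def element_lengths(conn, elements):
    """Length of each requested element, as a name-to-length mapping."""
    sql = (
        "SELECT element_name, length FROM element_location "
        f"WHERE element_name IN ({placeholders(elements)})"
    )
    return dict(conn.execute(sql, list(elements)).fetchall())


wanted = ["NAMA", "PANGKAT", "JABATAN"]
print(sources_for(conn, wanted))
print(key_elements(conn, "RECA"))
print(element_lengths(conn, wanted))
```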

Furthermore, in order to satisfy the five normal forms, a full and clear description of each entity is needed. These descriptions are contained within the data dictionary. The issue of actual database design is beyond the scope of this thesis and is left for possible follow-on thesis work.

C.  CHOOSING THE DATA DICTIONARY SYSTEM (DDS)

1.  Features

The available commercial DDSs have most of the following features (see Figure 5.2 for more detail).

a.   Dictionary  Schema 

This is a feature used to generate a manufacturer's standard schema, such as entities, relationships, and attributes.

b.  Schema Extensibility

This is a feature whereby an installation is able to customize the manufacturer's standard schema by adding to it new entity-types, relationship-types, and attribute-types.

c.  Dictionary  Maintenance 

This  is  a  feature  that  enables  an  installation 
to  create  and  maintain  its  data  dictionary. 

d.  Reports and Queries

An installation may generate reports using meta-data contained within the data dictionary. A DDS provides these abilities via these features.

e.  Bridge/Interface Facility

This  feature  generates  descriptions  from  the 
data  dictionary  needed  by  other  systems,  typically  an 
application  development  tool  such  as  DATA  DESIGNER. 

f.  Program  Access  Facility 

This enables an installation to extend the functionality of a DDS, i.e., to prepare programs able to access data dictionary contents.

g.  Status  Facility 

This  feature  provides  the  status  of  each  entity, 
especially  when  the  data  dictionary  is  used  in  the  system 
life  cycle,  by  providing  aids  in  application  development. 

h.   Security  Facility 

An  installation  may  restrict  access  to  the  data 
dictionary  to  authorized  personnel  only.  This  requirement  is 
fulfilled  by  this  feature. 


2.  DISPULLAHTAD's DDS Requirements

a.  Data  Dictionary  Features 

There are five data dictionary features required in order to implement DISPULLAHTAD's data dictionary. These are:

•  Dictionary Schema, which is needed in order to generate the entities, relationships, and attributes.

•  Schema Extensibility, which will be needed to accommodate specific needs that cannot be fulfilled by the dictionary schema.

•  Dictionary Maintenance, which is used to create and maintain the data dictionary.

•  Reports and Queries, which are required in order to generate reports from, and to extract data contained within, the data dictionary.

•  Security Facility. Even though the data dictionary holds no actual instances of "secured" data, it will contain information about such data that can be used to access it, i.e., entity_name, access_method, where such data are stored, etc. Therefore, a security facility is required to add to and strengthen the level of security.

Other features are optional. These can be considered as "nice to have".

b.  Active  Versus  Passive  Data  Dictionary 

A fully-active or partially-active system is very desirable because it provides features such as enforcing standards, range_of_value auditing, transaction monitoring, etc. On the other hand, an active system incurs much overhead, suffers from longer turn-around time, and requires more complex processing algorithms. Therefore, at this time, a passive data dictionary system is more appropriate for DISPULLAHTAD. In the future, after enough experience has been gained, a partially-active system may be applied in order to take fuller advantage of the data dictionary.

c.  Free-standing  Versus  DBMS-dependent  System 

A free-standing system is very appropriate for an installation having different DBMSs. This may happen in an installation with databases using network or hierarchical structures in conjunction with newer technology such as the relational system. In this case, all systems may access the data dictionary independently, since the usage of the data dictionary is not limited to any one system.

On the other hand, a data dictionary may be implemented as one of the DBMS applications (some DDSs are implemented in this manner) [Ref. 8]. This approach is appropriate for an installation implementing databases using a single DBMS, and DISPULLAHTAD falls into this category. Therefore, a DBMS-dependent system will provide more advantages for DISPULLAHTAD, e.g., it can be used as a training tool in implementing the database. Another possible advantage is that if DISPULLAHTAD should change from a passive to an active system, there will not be too many modifications required because the data dictionary and the databases are already compatible.

d.  Make  or  Buy 

Buying an available commercial system may provide a high-quality, ready-to-use system. Figure 5.2 summarizes features of current commercial systems that may be used in choosing the best system fulfilling the required features. A commercial system used by many installations without much trouble may be an indication that it has a certain quality. The criteria listed in Figure 5.1 may be used in selecting the best system for DISPULLAHTAD.

1.  It should have at least the following five features, which may be considered the primary criteria:

a.  Dictionary Schema.

b.  Schema Extensibility.

c.  Dictionary Maintenance.

d.  Reports and Queries.

e.  Security Facility.

2.  The following four criteria may be considered the secondary criteria:

a.  Compatible with current hardware.

b.  Compatible with applied DBMS.

c.  High quality assurance.

d.  Low Acquisition and Set-up Costs.


Figure 5.1    Criteria for Choosing
Commercial Data Dictionary System.
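
One way to apply the criteria of Figure 5.1 is as a two-stage screen: discard any candidate lacking a primary feature, then rank the survivors on the secondary criteria. The sketch below is an assumption-laden illustration. Figure 5.1 does not say how the secondary criteria should be weighed, so ranking by a simple count is a choice made here, and the candidate names are hypothetical.

```python
from dataclasses import dataclass

PRIMARY = frozenset({
    "Dictionary Schema",
    "Schema Extensibility",
    "Dictionary Maintenance",
    "Reports and Queries",
    "Security Facility",
})
SECONDARY = frozenset({
    "Compatible with current hardware",
    "Compatible with applied DBMS",
    "High quality assurance",
    "Low acquisition and set-up costs",
})


@dataclass(frozen=True)
class Candidate:
    name: str
    features: frozenset                   # DDS features the product offers
    secondary_met: frozenset = frozenset()  # secondary criteria it satisfies


def shortlist(candidates):
    """Keep candidates with every primary feature; rank by secondary count."""
    qualified = [c for c in candidates if PRIMARY <= c.features]
    return sorted(
        qualified,
        key=lambda c: (-len(c.secondary_met & SECONDARY), c.name),
    )


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("DDS-A", PRIMARY, frozenset({
        "Compatible with current hardware",
        "High quality assurance",
    })),
    Candidate("DDS-B", PRIMARY - {"Security Facility"}, SECONDARY),
    Candidate("DDS-C", PRIMARY | {"Status Facility"}, frozenset({
        "Compatible with current hardware",
        "Compatible with applied DBMS",
        "Low acquisition and set-up costs",
    })),
]

ranked = [c.name for c in shortlist(candidates)]
print(ranked)
```

DDS-B is dropped despite meeting every secondary criterion, because the primary criteria act as a gate rather than as points.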


On the other hand, one significant advantage of designing a DDS in-house is that the system can be fitted to the specific requirements of the installation. Furthermore, given that DISPULLAHTAD will implement a DBMS, the relational dictionary presented in the previous chapter may be a good candidate for the first application. By implementing the dictionary as a relational database, only one system needs to be acquired (DBMS) instead of two (DBMS and DDS).

3.  Recommendation

For the reasons discussed in the previous section, it is better for DISPULLAHTAD to implement the data dictionary model described in the previous chapter as the first application of the DBMS. The dictionary will be a DBMS-dependent system. Initially, the system should be passive, and later, if appropriate, it may be changed to an active system.


[Figure 5.2: summary of the features of current commercial data dictionary systems (table not reproduced)]

VI.  CONCLUSION

Implementing a DBMS is a "must" for DISPULLAHTAD in order to control the proliferation of its applications, which in turn raises problems of data redundancy and data inconsistency. Primarily, a DBMS provides data manipulation capabilities whereas a data dictionary provides management and control. Applying management and control first (by implementing a DDS) will make the job of database design and implementation easier, in terms of lessening the difficulties and the time and effort required to develop databases. In this regard, designing a data dictionary may be considered as a stepping-stone to the implementation of a DBMS.

This thesis has presented a relational model of a dictionary which can satisfy the needs of DISPULLAHTAD. The advantages of this model are:

1)  it is compatible with any relational DBMS that DISPULLAHTAD may procure.

2)  it obviates the need to buy a DDS in addition to the DBMS.

3)  it can be tailored specifically to DISPULLAHTAD's needs because of the flexibility of the relational model.

4)  it satisfies the criteria for a DDS (see Figure 5.1).

In order to attain the objective of managing and controlling data resources, the following implementation policy is recommended:

•  All  personnel  involved  in  software  development  (such 
as  system  analysts,  programmers,  etc.)  should  use  the  data 
dictionary  extensively  in  doing  their  jobs. 

•  Only the data administrator staff may update the data dictionary; others may access it in read-only mode.

•  All suggestions concerning the data dictionary may be addressed to the data administrator staff.

This thesis has stopped short of suggesting a database design for DISPULLAHTAD's personnel application. Follow-on work could be done using the dictionary model suggested herein as a foundation.